Analysis of Hadoop’s Performance under Failures

Authors

  • Florin Dinu
  • T. S. Eugene Ng
Abstract

Failures are common in today’s data center environment and can significantly impact the performance of important jobs running on top of large-scale computing frameworks. In this paper, we analyze Hadoop’s behavior under compute node and process failures. Surprisingly, we find that even a single failure can have a large detrimental effect on job running times. We uncover several important design decisions underlying this distressing behavior: the inefficiency of Hadoop’s statistical speculative execution algorithm, the lack of sharing of failure information, and the overloading of TCP failure semantics. We hope that our study will add new dimensions to the pursuit of robust large-scale computing framework designs.

Similar resources

Mochi: Visual Log-Analysis Based Tools for Debugging Hadoop

Mochi, a new visual, log-analysis based debugging tool, correlates Hadoop’s behavior in space, time and volume, and extracts a causal, unified control- and dataflow model of Hadoop across the nodes of a cluster. Mochi’s analysis produces visualizations of Hadoop’s behavior that users can use to reason about and debug performance issues. We provide examples of Mochi’s value in revealing a Hadoop jo...

Robust Control of a Quadrotor in the Presence of Actuators' Failure

Today, robots and unmanned aerial vehicles are being used extensively in modern societies. Due to a wide range of applications, it has attracted much attention among scientists over the past decades. This paper deals with the problem of the stability of a four-rotor flying robot called quadrotor, which is an under-actuated system, in the presence of operator or sensor failures. The dynamica...

Hadoop’s Overload Tolerant Design Exacerbates Failure Detection and Recovery

Data processing frameworks like Hadoop need to efficiently address failures, which are common occurrences in today’s large-scale data center environments. Failures have a detrimental effect on the interactions between the framework’s processes. Unfortunately, certain adverse but temporary conditions such as network or machine overload can have a similar effect. Treating this effect oblivious to...

Analysis of an M/G/1 Queue with Multiple Vacations, N-policy, Unreliable Service Station and Repair Facility Failures

This paper studies an M/G/1 repairable queueing system with multiple vacations and N-policy, in which the service station is subject to occasional random breakdowns. When the service station breaks down, it is repaired by a repair facility. Moreover, the repair facility may fail during the repair period of the service station. The failed repair facility resumes repair after completion of its re...

Publication date: 2011